
    Application of Machine Learning within Visual Content Production

    We are living in an era in which digital content is produced at a dazzling pace. Contents and contexts are so heterogeneous that numerous applications have been created to respond to people's and market demands. The visual content production pipeline generalises the process that allows a content editor to create and evaluate a product such as a video, an image or a 3D model. Such data is then displayed on one or more devices: TVs, PC monitors, virtual reality head-mounted displays, tablets, mobile phones or even smartwatches. Content creation can be as simple as clicking a button to film a video and share it on a social network, or as complex as managing a dense, parameter-heavy user interface with keyboard and mouse to generate a realistic 3D model for a VR game. In the second case, such sophistication results in a steep learning curve for beginner-level users, while expert users regularly need to refine their skills through expensive lessons, time-consuming tutorials or experience. User interaction therefore plays an essential role in the diffusion of content creation software, especially when it targets untrained people. In particular, the fast spread of virtual reality devices into the consumer market has created new opportunities for designing reliable and intuitive interfaces. These new interactions need to take a step beyond the point-and-click interaction typical of the 2D desktop environment: they must be smart, intuitive and reliable, and able to interpret 3D gestures, which in turn requires more accurate pattern-recognition algorithms. In recent years, machine learning, and deep learning in particular, has achieved outstanding results in many branches of computer science, such as computer graphics and human-computer interaction, outperforming algorithms that were considered state of the art; however, there have been only fleeting efforts to translate these results into virtual reality.

    In this thesis, we apply deep learning models to two areas of the content production pipeline: advanced methods for user interaction and visual quality assessment. First, we focus on 3D sketching to retrieve models from an extensive database of complex geometries and textures while the user is immersed in a virtual environment. We explore both 2D and 3D strokes as tools for model retrieval in VR and implement a novel system for improving the accuracy of 3D model search. We contribute an efficient method to describe models through 3D sketches via iterative descriptor generation, focusing on both accuracy and user experience, and we design a user study to compare different interactions for sketch generation. Second, we explore the combination of sketch input and vocal description to correct and fine-tune the search for 3D models in a database containing fine-grained variation; we analyse sketch and speech queries and identify a way to incorporate both into our system's interaction loop. Third, in the context of the visual content production pipeline, we present a detailed study of visual metrics and propose a novel method for detecting rendering-based artefacts in images, which exploits deep learning algorithms analogous to those used when extracting features from sketches.
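
    As a rough, hypothetical illustration of the descriptor-based retrieval loop described above (not the thesis' actual architecture), a learned encoder can map the current sketch to a descriptor vector that is matched against precomputed descriptors of the database models; the function and array names below are placeholders.

        import numpy as np

        def retrieve_models(sketch_descriptor, model_descriptors, k=5):
            """Return indices of the k database models whose descriptors are
            closest (by cosine similarity) to the query sketch descriptor.

            sketch_descriptor : (d,) array produced by a (hypothetical) sketch encoder
            model_descriptors : (n, d) array of precomputed descriptors, one per model
            """
            q = sketch_descriptor / np.linalg.norm(sketch_descriptor)
            db = model_descriptors / np.linalg.norm(model_descriptors, axis=1, keepdims=True)
            scores = db @ q                   # cosine similarity to every model
            return np.argsort(-scores)[:k]    # best-matching models first

        # Iterative refinement: as the user adds strokes, the sketch is re-encoded
        # and the query re-run, so the ranking updates after every stroke.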

    Fast Blue-Noise Generation via Unsupervised Learning

    Blue noise is known for its uniformity in the spatial domain and for avoiding the appearance of structures such as voids and clusters. Because of this characteristic, it has been adopted in a wide range of visual computing applications, such as image dithering, rendering and visualisation. This has motivated the development of a variety of generative methods for blue noise, with different trade-offs in terms of accuracy and computational performance. We propose a novel unsupervised learning approach that leverages a neural network architecture to generate blue noise masks with high accuracy and real-time performance, starting from a white noise input. We train our model by combining three unsupervised losses that condition the Fourier spectrum and intensity histogram of the noise masks predicted by the network. We evaluate our method by using the generated noise in two applications: grayscale blue noise masks for image dithering, and blue noise samples for Monte Carlo integration.
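
    The paper's exact three-loss formulation is not reproduced here; the following minimal sketch only illustrates, under assumed definitions, how a predicted noise mask's Fourier spectrum and intensity histogram can be conditioned with differentiable penalties in PyTorch. The target_spectrum tensor and the loss weights are hypothetical.

        import torch

        def spectrum_loss(mask, target_spectrum):
            """Penalise deviation of the mask's power spectrum from a target
            blue-noise spectrum (low energy at low frequencies)."""
            spec = torch.fft.fftshift(torch.fft.fft2(mask - mask.mean()))
            power = spec.abs() ** 2
            return torch.mean((power - target_spectrum) ** 2)

        def histogram_loss(mask):
            """Encourage a flat intensity histogram: the sorted pixel values of a
            uniformly distributed mask should form a linear ramp on [0, 1]."""
            values, _ = torch.sort(mask.flatten())
            ramp = torch.linspace(0.0, 1.0, values.numel(), device=mask.device)
            return torch.mean((values - ramp) ** 2)

        def total_loss(mask, target_spectrum, w_spec=1.0, w_hist=1.0):
            # Weighted combination of the spectrum and histogram terms.
            return w_spec * spectrum_loss(mask, target_spectrum) + w_hist * histogram_loss(mask)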

    MR-RIEW: An MR Toolkit for Designing Remote Immersive Experiment Workflows

    We present MR-RIEW, a toolkit for virtual and mixed reality that gives researchers a dynamic way to design an immersive experiment workflow, including instructions, environments, sessions, trials and questionnaires. It is implemented in Unity via scriptable objects, allowing simple customisation: the graphic elements, scenes and questionnaires can be selected and associated without code. MR-RIEW can save questionnaire answers both locally on the headset and remotely; the remote solution relies on the Google Firebase service and requires only minimal configuration.
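
    Purely for illustration, the sketch below shows the kind of data such an experiment workflow might bundle and how answers could be pushed to the Firebase Realtime Database REST API from Python; MR-RIEW itself defines workflows through Unity scriptable objects, and the project URL and field names here are placeholders.

        import requests

        # Hypothetical workflow definition; in MR-RIEW the equivalent data lives in
        # Unity scriptable objects and is assembled without writing code.
        workflow = {
            "instructions": "Fit the headset, then press the trigger to begin.",
            "environment": "room_small",
            "sessions": [
                {"trials": 10, "condition": "laser_pointer"},
                {"trials": 10, "condition": "no_pointer"},
            ],
            "questionnaire": ["q1_presence", "q2_workload"],
        }

        def save_answers_remotely(participant_id, answers,
                                  base_url="https://example-project.firebaseio.com"):
            """Store questionnaire answers under the participant's node using the
            Firebase Realtime Database REST API (the URL is a placeholder)."""
            url = f"{base_url}/answers/{participant_id}.json"
            response = requests.put(url, json=answers, timeout=10)
            response.raise_for_status()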

    Ubiq-Genie: Leveraging External Frameworks for Enhanced Social VR Experiences

    This paper describes the Ubiq-Genie framework for integrating external frameworks with the Ubiq social VR platform. The proposed architecture is modular, allowing for easy integration of services and providing mechanisms to offload computationally intensive processes to a server. To showcase the capabilities of the framework, we present two prototype applications: 1) a voice- and gesture-controlled texture generation method based on Stable Diffusion 2.0 and 2) an embodied conversational agent based on ChatGPT. This work aims to demonstrate the potential of integrating external frameworks into social VR for the creation of new types of collaborative experiences.
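
    As a hedged sketch of the server-side half of such a texture service (not the Ubiq-Genie code, whose message protocol is not shown), a text prompt transcribed from the user's voice command could be turned into a texture with the Hugging Face diffusers pipeline for Stable Diffusion 2.

        # Rough, hypothetical server-side sketch; the actual Ubiq-Genie service
        # and its client/server messaging may differ.
        import torch
        from diffusers import StableDiffusionPipeline

        pipe = StableDiffusionPipeline.from_pretrained(
            "stabilityai/stable-diffusion-2",
            torch_dtype=torch.float16,
        ).to("cuda")

        def generate_texture(prompt: str, size: int = 512):
            """Turn a transcribed voice prompt into a square texture image that the
            VR client can apply to the object the user gestured at."""
            image = pipe(prompt, height=size, width=size).images[0]
            return image  # PIL.Image, ready to be encoded and sent to the client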

    Shall I describe it or shall I move closer? Verbal references and locomotion in VR collaborative search tasks

    Research on pointing-based communication within immersive collaborative virtual environments (ICVEs) remains a compelling area of study. Previous studies explored techniques to improve accuracy and reduce errors when hand-pointing from a distance. In this study, we explore how users adapt their behaviour to cope with a lack of pointing accuracy. In an ICVE where users can move (i.e., locomotion), the inaccuracy caused by the absence of laser pointers can be avoided by getting closer to the object of interest; alternatively, collaborators can enrich their utterances with details to compensate for the lack of pointing precision. Inspired by previous CSCW work on remote desktop collaboration, we measure visual coordination, the implicitness of deictic utterances and the amount of locomotion. We design an experiment that compares the effects of the presence/absence of laser pointers across hard/easy-to-describe referents. Results show that when users face pointing inaccuracy, they prefer to move closer to the referent rather than enrich the verbal reference.
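
    A minimal sketch, assuming a hypothetical log of per-frame head positions, of how the amount of locomotion could be quantified per participant:

        import numpy as np

        def locomotion_distance(positions):
            """Total horizontal distance travelled, given logged head positions as an
            (n, 3) array of (x, y, z) samples; the log format is an assumption."""
            steps = np.diff(positions[:, [0, 2]], axis=0)  # ignore vertical motion
            return float(np.sum(np.linalg.norm(steps, axis=1)))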

    Mitigation strategies for participant non-attendance in VR remote collaborative experiments

    COVID-19 led to the temporary closure of many HCI research facilities, disrupting many ongoing user studies. While some studies could easily move online, this has proven problematic for virtual reality (VR) studies. The main challenge of a remote VR study is recruiting participants who have access to specialised hardware such as head-mounted displays. This challenge is exacerbated in collaborative VR studies, where multiple participants need to be available and remotely connected to the study simultaneously; we identify this as the worst-case scenario with regard to resource wastage and frustration. Across two collaborative user studies, we identified the personal connection between the experimenter and the participant as a critical factor in reducing non-attendance. We compare three recruitment strategies that we iteratively developed based on our recent experiences, introduce a metric to quantify the cost of each recruitment strategy, and show that our final strategy achieves the best score. Our work is valuable for HCI researchers recruiting participants for collaborative VR remote studies, but it can easily be extended to any remote experiment scenario.

    VR Toolkit for Identifying Group Characteristics

    Visualising crowds is a key topic in pedestrian dynamics, with significant research efforts aiming to improve the current state of the art. Sophisticated visualisation methods are standard in modern commercial models and can improve crowd management techniques and the development of sociological theory. These models often expose standard metrics, including density and speed. However, modern visualisation techniques typically use desktop screens, which can limit a user's ability to investigate and identify key features, especially in real-time scenarios such as control centres. Virtual reality (VR) provides the opportunity to represent scenarios in a fully immersive environment, granting the user the ability to assess situations quickly. Furthermore, existing visualisations are often tied to the simulation model that generated the dataset, rather than being source-agnostic. In this paper, we implement an immersive, interactive toolkit for crowd behaviour analysis. The toolkit was built specifically for use within VR environments and was developed in conjunction with commercial users and researchers. It allows the user to identify locations of interest as well as individual agents, showing characteristics such as group density, individual (Voronoi) density and speed. It was also used as a data-extraction tool, building individual fundamental diagrams for all scenario agents and predicting group status as a function of local agent geometry. Finally, this paper presents an evaluation of the toolkit by crowd behaviour experts.
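
    The snippet below is a minimal sketch, not the toolkit's own code, of how the per-agent metrics mentioned above (individual Voronoi density and speed) can be computed from trajectory data with SciPy; the input formats are assumptions.

        import numpy as np
        from scipy.spatial import Voronoi, ConvexHull

        def voronoi_densities(positions):
            """Individual (Voronoi) density for each agent: 1 / area of the agent's
            Voronoi cell. Agents with an unbounded cell (at the edge of the crowd)
            get NaN. `positions` is an (n, 2) array of x, y coordinates."""
            vor = Voronoi(positions)
            densities = np.full(len(positions), np.nan)
            for i, region_index in enumerate(vor.point_region):
                region = vor.regions[region_index]
                if len(region) == 0 or -1 in region:
                    continue                                  # unbounded cell, skip
                cell = vor.vertices[region]
                densities[i] = 1.0 / ConvexHull(cell).volume  # 2-D "volume" is area
            return densities

        def speeds(prev_positions, positions, dt):
            """Per-agent speed between two consecutive frames separated by dt seconds."""
            return np.linalg.norm(positions - prev_positions, axis=1) / dt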